ABSTRACT

Background: Although recent genome-wide 
association studies have identified several genetic 
variants contributing to the complex aetiology of 
multiple sclerosis (MS), expression and functional 
studies are required to further understand its 
molecular basis. 

Objectives: To identify genes and pathways with 
differential expression in MS. 
Design: The authors conducted a systematic review of 
seven microarray studies, in which expression in 
immune cells was compared between MS patients and 
controls. These studies include a previously 
unpublished study, which is described here in detail. 
Results and conclusion: Although in general the 
overlap between studies was poor, 229 genes were 
found to be differentially expressed in MS in at least 
two studies, of which 11 were in three studies and 
HSPA1A in four studies. After excluding the authors' 
unpublished experiment which may have been affected 
by certain confounding factors and inclusion of treated 
subjects, 135 genes were identified in at least two 
studies. The differentially expressed genes were 
significantly associated with several immunological 
pathways, including interleukin (IL)-4, IL-6, IL-17 and 
glucocorticoid receptor signalling pathways. 15 of the 
229 loci have shown some association with MS in 
published genome-wide association studies 
(p<0.0001), including three loci with confirmed MS 
risk variants. 



ARTICLE SUMMARY 



Article focus 

■ To identify genes showing differential expression 
in multiple sclerosis through genome-wide 
expression profiling in peripheral blood mono- 
nuclear cells. 

■ To conduct a systematic review of genome-wide 
expression studies in multiple sclerosis in order 
to identify the most frequently reported genes. 

■ To identify pathways associated with genes most 
frequently reported as differentially expressed in 
multiple sclerosis. 

Key messages 

■ The vast majority of all genes reported as 
differentially expressed were only identified in 
a single study. 

■ However, 229 genes were reported as differentially
expressed in MS in the same direction in at least two
of the seven studies reviewed, 12 of which were
reported in at least three studies.

■ After excluding our unpublished experiment, 
which may have been affected by confounding 
factors and inclusion of treated subjects, 135 
genes were identified in at least two studies. 

■ The differentially expressed genes were signifi- 
cantly associated with several immunological 
pathways, including the IL-4, IL-6, IL-17 and 
glucocorticoid receptor signalling pathways. 



For numbered affiliations see 
end of article. 



Correspondence to 

Anu Kemppinen; 
ak635@medschl.cam.ac.uk 



INTRODUCTION 

The aetiology of multiple sclerosis (MS) is
complex and involves both genetic susceptibility
and environmental factors. However,
apart from the widely replicated association
with human leukocyte antigen (HLA)-DRB1*1501,
genetic risk factors have remained unknown until
genome-wide association studies (GWAS), which
have recently led to identification of common MS
risk variants in over a dozen loci. 1–6 Although
further fine-mapping and functional studies 
are required in order to verify the causal 
variants and genes, a strong presence of 
immunological genes in these loci is evident. 
Expression and functional studies in immune 
cells can therefore elucidate the molecular 



mechanisms behind MS. Indeed, a number 
of studies have been conducted where 
genome-wide expression profiles in periph- 
eral immune cells were compared between 
MS patients and unaffected controls. 
Together, these studies have reported a large 
number of genes with differential expression 
in MS. However, given that most of these 
studies have been conducted in small 
samples without replication, it is likely that 
many of the findings are false positives. 
Approaches are therefore needed to increase 
the probability of detecting the true signals 
from the vast number of reported genes. In 
order to extract the genes which are more 
likely to be true positives, we systematically 
reviewed results from seven microarray 
studies in MS, including our previously 
unpublished study. First, we identified genes 
ARTICLE SUMMARY



Strengths and limitations of this study 

■ This is the first systematic review of genome-wide expression 
studies conducted in peripheral immune cells in multiple 
sclerosis. 

■ Strict criteria were applied for inclusion of studies, and clearly 
underpowered studies with fewer than 10 cases or controls 
were excluded. 

■ Many of the genes we found to be reported by at least two 
studies have interesting immunological functions and can be 
considered promising candidates for further studies. 

■ However, the studies included should not be considered
directly comparable owing to differences in samples, platforms
and analysis methods used. In addition, the majority of these
studies are small and should be viewed with some caution.

■ All studies were conducted in relatively heterogeneous cell 
populations, and some of the findings could therefore be 
explained by differences in numbers of different cell popula- 
tions rather than differential transcriptional activity in MS. 

■ Finally, our previously unpublished microarray study may have
been affected by differences between the labs where the patient
and control samples were prepared for arrays, as well as by
the higher mean age of controls. Our study also included four
patients who had received immunomodulatory treatment at the
time of sample collection.



which had been found to be differentially expressed in
MS in the same direction in at least two studies. In order
to further examine the potential role of these most 
frequently reported genes, they were analysed using 
pathway tools. Finally, we searched these genes for 
evidence of association by making comparisons with top 
results from recent MS GWAS. 

MATERIALS AND METHODS 

Samples in the Finnish microarray experiment 

Twelve female patients fulfilling Poser's criteria for
clinically definite MS were recruited through the Seinäjoki
Central Hospital. Fifteen healthy unrelated female
controls were obtained from the Finnish Twin Study on
Ageing (FITSA). The mean age was 54.2 years in patients and
71.6 years in controls. One patient was receiving cortisone
treatment, two patients were receiving β interferon, and one
patient was being treated with both β interferon and
cortisone at the time of sample collection. All subjects
had provided their informed consent. Peripheral blood 
mononuclear cells (PBMCs) were isolated from whole 
blood using BD Vacutainer CPT Cell Preparation Tubes 
(Becton, Dickinson and Company, Franklin Lakes, New
Jersey), and cells were disrupted and RNA extracted with
TRIzol Reagent (Invitrogen, Carlsbad, California). RNA
was then purified using the RNeasy Mini Kit (Qiagen,
Hilden, Germany), and the sample quality was examined
using a BioAnalyzer (Agilent, Santa Clara, California). The
study was approved by the Committee on Ethics of the 
Central Hospital of Central Finland and by the Helsinki 
University Hospital Ethical Committee of Ophthal- 
mology, Otorhinolaryngology, Neurology and Neuro- 



surgery (permit 192/E9/02) for FITSA and patient 
samples, respectively. 

Sample processing and microarrays 

Eleven patient samples were prepared for hybridisation 
on the Affymetrix GeneChip Human Genome U133 Plus 
2.0 Array (Affymetrix, Santa Clara, California) according 
to the manufacturer's recommendations in our labora- 
tory. In addition, one patient sample and technical 
replicates from two of the 11 patient samples were 
prepared according to the manufacturer's recommen- 
dations at the Helsinki Biomedicum Biochip Centre 
(BBC), where 15 control samples had been previously 
prepared. In brief, 1–2 µg of total RNA was converted to
biotin-labelled cRNA using the Affymetrix HT One-Cycle 
cDNA Synthesis Kit and the HT IVT Labelling Kit. 
Fifteen micrograms of cRNA was then fragmented and 
hybridised for 16 h at 45 °C, washed in Affymetrix 
Fluidics Station 450 and scanned with Affymetrix Gene- 
Chip Scanner 3000. Hybridisation, washing, staining and 
scanning were conducted using the same instruments 
for all samples. All arrays had a present call percentage
>40 (42–47) and an average background signal <50
(36–44).

Microarray data analysis 

Raw intensity data files were imported to GeneSpring 7.3 
(Agilent Technologies, Santa Clara, California) and GC 
Robust Multi-array Average (GC-RMA) normalised. We 
then applied filtering steps in order to exclude probe 
sets with low signal intensity and probe sets potentially 
affected by differences between our laboratory and the 
BBC. First, we excluded all probe sets with a GC-RMA 
normalised signal <50 in at least 20 of the 27 arrays 
(N=35 203). Second, we excluded probe sets where the 
technical replicates showed a >1.4-fold difference in
both replicate pairs (N=1668). Finally, we excluded
probe sets where the signal in all three MS arrays
prepared at the BBC ranked among the four lowest or
four highest among the MS arrays (N=2469), after which
15 273 probe sets remained for analyses. After filtering, 
we discarded the two replicate arrays prepared in our 
laboratory and used only the replicates prepared at the 
BBC for final analyses, which thereby included 12 MS 
arrays and 15 control arrays. In order to identify genes 
with differential expression in MS, we first applied the 
fold-change filter in GeneSpring 7.3 using 1.5 as 
threshold. For probe sets showing a ≥1.5-fold difference
in mean expression between MS patients
and controls, we further determined non-parametric
Mann-Whitney rank sum test p values with
Benjamini-Hochberg correction for multiple testing. Probe
sets with a corrected p value of <0.05 were considered to
be differentially expressed. In order to annotate these 
probe sets, we compared the gene symbol obtained 
from GeneSpring 7.3 in October 2008 with the 
NetAffx annotation in December 2010 for each probe 
set (http://www.affymetrix.com/analysis/index.affx). If 
these were different, we first checked whether these were 
alternative identifiers for the same gene. If not, we
obtained the probe set nucleotide sequences from
NetAffx and performed a BLAT search in the UCSC Genome
Browser (hg19) to identify the correct target gene
(http://genome.ucsc.edu/cgi-bin/hgGateway). Probe
sets which recognise only intronic sequences or map to
several loci were excluded from the list of differentially
expressed probe sets. The Gene Expression Omnibus
(GEO) accession number for the microarray dataset is
GSE21942 (http://www.ncbi.nlm.nih.gov/geo/).
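The fold-change filter and multiple-testing correction described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the GeneSpring 7.3 implementation: the function names are hypothetical, the fold change is taken as a ratio of means on the linear scale, and the Mann-Whitney p values are assumed to have been computed elsewhere.

```python
from statistics import mean


def fold_change(case_values, control_values):
    """Ratio of mean expression (cases / controls) on the linear scale."""
    return mean(case_values) / mean(control_values)


def passes_fold_filter(case_values, control_values, threshold=1.5):
    """True if mean expression differs at least `threshold`-fold in either direction."""
    fc = fold_change(case_values, control_values)
    return fc >= threshold or fc <= 1.0 / threshold


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

In this sketch a probe set would be called differentially expressed if it passes `passes_fold_filter` and its adjusted p value is below 0.05, mirroring the two-step rule in the text.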

Pathway analysis 

Pathway analyses were conducted using the Core Analysis 
option in the Ingenuity Pathway Analysis software 
(Ingenuity Systems, Redwood City, California). This 
option identifies canonical pathways associated with 
a given list of genes by calculating the Fisher exact test p 
value for the probability that association between this set 
of genes and a canonical pathway is explained by chance 
alone. In order to account for the fact that our input lists 
of genes were enriched for immunological genes and 
would therefore show association with immune-related 
pathways if compared with all genes, we restricted the 
analyses to genes expressed in immune cells by applying 
the Tissues & Cell lines filter in the analysis settings. 
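The enrichment test behind such a pathway analysis can be illustrated with a one-sided Fisher exact test, which is the upper tail of a hypergeometric distribution. The sketch below is an assumption-laden illustration rather than the Ingenuity implementation: the function name is hypothetical, and the size of the background gene set (here `N`) depends on the reference set chosen in the analysis settings, which the text does not state.

```python
from math import comb


def fisher_enrichment_p(k, n, K, N):
    """One-sided Fisher exact p value, P(X >= k), for hypergeometric X.

    k: genes from the input list that fall in the pathway
    n: size of the input gene list
    K: genes in the pathway within the background set
    N: size of the background set (an assumption of this sketch)
    """
    upper = min(n, K)
    total = comb(N, n)
    # Sum the probabilities of observing k or more pathway genes by chance.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total
```

Restricting the background to genes expressed in immune cells, as described above, corresponds to choosing a smaller `N` and `K`, which makes the test more conservative for immune-related pathways.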

Systematic review of microarray studies in MS 

We searched PubMed with keywords 'multiple sclerosis 
microarray' in November 2010 and obtained 156 
records. These were complemented with two additional 
recent studies and studies identified through a review 
article. 7–9 Based on title, abstract and, if required, full
text, we first identified all studies which had been 
conducted in peripheral immunological cells using 
a microarray platform and compared expression profiles 
between MS patients and unaffected controls. This led to 
the exclusion of 144 studies, which were not MS-related 
expression studies or investigated effects of MS treat- 
ments, were conducted in animal models for MS, or were 
performed in MS brain biopsies rather than immune 
cells. We then reviewed the remaining studies in further 
detail and excluded six studies with fewer than 10 MS 
patients and/or controls, a study which did not identify
any differentially expressed genes, a study where only 
genes involved in T-cell mediated cytotoxicity were 
included in the analyses and a study where the list of 
identified genes was not available in the publication or 
upon contact with the corresponding author. The stages 
of the selection process are depicted as a flow chart in 
figure 1. Of the remaining eight studies which are listed
in table 1, only six were independent, because the two
studies by Satoh et al 15 16 and the studies by Achiron
et al 10 and Mandel et al 13 had been conducted in the
same set of patients and were from here on considered to
be single studies. After including our own unpublished 
experiment, we therefore had seven independent studies 
for the analysis. For each study, we listed all genes which 
had been reported to be differentially expressed in MS 
patients in comparison with healthy controls, and 
Figure 1 Flow chart showing stages in selecting studies for
systematic review. MS, multiple sclerosis. 

recorded whether their expression in MS was increased 
or decreased. The studies by Achiron et al 10 and Mandel
et al 13 listed only a selected subset of the identified
differentially expressed genes. 10 13 The corresponding
author was contacted in order to obtain the full lists of 
differentially expressed genes, but we received no reply, 
and we therefore only included the reported genes. All 
gene symbols were mapped to the Human Gene 
Nomenclature gene symbols, and genes that were not 
unambiguously linked to a single gene symbol using the 
information available were excluded. 
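The counting step of the review, identifying genes reported with the same direction of change in at least two independent studies, can be sketched as below. The study labels and gene calls are toy values invented for illustration and are not data from the reviewed studies; the function name is likewise hypothetical.

```python
from collections import defaultdict


def replicated_genes(studies, min_studies=2):
    """Map (gene, direction) to sorted study names for pairs seen in >= min_studies."""
    seen = defaultdict(set)
    for study, calls in studies.items():
        for gene, direction in calls:
            seen[(gene, direction)].add(study)
    return {
        key: sorted(names)
        for key, names in seen.items()
        if len(names) >= min_studies
    }


# Toy input with hypothetical study labels and calls (not data from the review).
toy = {
    "study_A": [("GENE1", "up"), ("GENE2", "down")],
    "study_B": [("GENE1", "up"), ("GENE2", "up")],
    "study_C": [("GENE2", "down"), ("GENE3", "up")],
}
result = replicated_genes(toy)
```

Keying on the pair of gene and direction means that a gene reported as increased in one study and decreased in another is not counted as replicated, while a gene can still appear under both directions if each is supported by two studies.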

RESULTS 

The previously unpublished microarray screen identifies 692 
probe sets with differential expression in MS 

We first analysed the data from our previously unpub- 
lished Finnish microarray screen and identified 692 
probe sets, which showed a ≥1.5-fold difference in mean
expression between 12 MS patients and 15 controls
together with a non-parametric Mann-Whitney rank sum
test p value of <0.05 after Benjamini-Hochberg
correction (supplementary table 1). Three hundred and
one probe sets showed increased expression, and 391 
decreased expression in MS. Pathway analysis revealed 
that the differentially expressed genes were strongly 
associated with PI3K signalling in B lymphocytes
(p=1.3E-06) and B cell development (p=2.6E-06)
pathways. In addition, the altered T cell and B cell signalling
in rheumatoid arthritis, role of PKR in interferon
induction and antiviral response, and production of
nitric oxide and reactive oxygen species in macrophages
pathways showed evidence of association with the
differentially expressed genes (p<1E-04).

Systematic review identifies 229 genes reported in at least 
two studies 

In order to perform a systematic review of expression 
studies in MS, we identified eight previous studies that
fulfilled our selection criteria (table 1). Studies which had
been conducted in the same set of patients were 
combined, and we therefore considered six previously 
published independent lists of differentially expressed 
genes. Together with our previously unpublished 
experiment, these studies identified 2017 unique genes 
with increased expression and 1860 with decreased 
expression in MS, of which 303 genes were reported to 
have both increased and decreased expression. However, 
this list is not comprehensive, as two of the studies did 
not provide full lists of the identified genes. Two 
hundred and twenty-nine of the genes were found to 
have differential expression in MS in the same direction
in at least two studies (supplementary table 2) and will 
be referred to as 'in silico replicated' differentially 
expressed genes (DEGs). One hundred and eleven of 
these had increased expression, and 119 decreased 
expression in MS, with one gene (STK4) identified as 
both. Eleven of the 229 genes were differentially 
expressed in three studies, and one gene, HSPA1A, in 
four studies (table 2). We acknowledge that our 
unpublished experiment may have been confounded by 
technical differences in processing of the patient and 
control samples, as well as by the age difference between 
cases and controls. In addition, the study included four 
patients who had received treatment, which may also 
have had an impact on the results. After excluding 



our study, 135 of the 229 DEGs were identified in at 
least two independent studies. These are indicated in 
supplementary table 2. 
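The gene counts quoted above can be cross-checked with simple inclusion-exclusion arithmetic, since genes reported in both directions would otherwise be counted twice. The short sketch below uses only numbers stated in the text; the variable names are illustrative.

```python
# Counts quoted in the Results text.
increased, decreased, both_directions = 2017, 1860, 303
unique_genes = increased + decreased - both_directions  # each gene counted once

replicated = 229
replicated_up, replicated_down, replicated_both = 111, 119, 1
# STK4 appears in both directions, so it is subtracted once.
assert replicated_up + replicated_down - replicated_both == replicated

share_replicated = replicated / unique_genes
print(unique_genes, round(100 * share_replicated, 1))  # prints: 3574 6.4
```

The resulting share of roughly 6% of reported genes being replicated in at least two studies illustrates how limited the overlap between studies is.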

Several immunological pathways are significantly 
associated with the in silico replicated DEGs 

In order to explore whether the in silico replicated DEGs 
are associated with specific pathways, we again applied
the canonical pathway analysis option in the Ingenuity 
Pathway Analysis software. We conducted pathway anal- 
ysis on the 135 DEGs identified in at least two indepen- 
dent studies excluding our unpublished study, and on 
the 229 DEGs identified when including our study. 
Unsurprisingly, many of the 20 most significantly asso- 
ciated pathways are related to immunological functions 
(tables 3, 4), with the most significantly associated 
pathway in both analyses being the glucocorticoid 
receptor signalling pathway. Other top-rated pathways 
include molecular mechanisms of cancer, NF-κB signalling
and several interleukin (IL) signalling pathways
(IL-4, IL-6, IL-17). 

Fifteen in silico replicated DEGs are suggestively
associated with MS

We then investigated whether the 229 in silico replicated 
DEGs showed any evidence for association in five previously
published GWASs or in a recently published GWAS
meta-analysis after excluding single nucleotide polymorphisms
(SNP) in the human leukocyte antigen
region (6p21–22). 1–6 All association p values were available
for the GWASs by Jakkula et al and the International
Multiple Sclerosis Genetics Consortium (IMSGC), 1 and
we therefore first examined the p value distribution of
SNPs mapping within 100 kb of the 229 DEGs by
quantile-quantile plots. Interestingly, these SNPs showed an
overall enrichment of p values in the range 0.1–0.001 in
the IMSGC GWAS (figure 2) . However, we did not see any 
evidence of enrichment in the Finnish GWAS. We then 
reviewed the reported SNPs in the five published GWASs 
and the GWAS meta-analysis, and found 15 of the 229 in 



Table 2 Genes identified as up- or down-regulated in multiple sclerosis in at least three studies

Gene      Description                                                     Direction of change in expression in multiple sclerosis
ATP7A     ATPase, Cu2+ transporting, alpha polypeptide                    Decreased (FIN 11 15 16)
CCL3      Chemokine (C-C motif) ligand 3                                  Decreased 11–13
CDKN1C    Cyclin-dependent kinase inhibitor 1C (p57, Kip2)                Decreased (FIN 9 11)
HSPA1A    Heat shock 70 kDa protein 1A                                    Decreased 9 12 13 15 16
PLAUR     Plasminogen activator, urokinase receptor                       Decreased 11–13
EIF4A1    Eukaryotic translation initiation factor 4A1                    Increased (FIN 9 11)
NEAT1     Nuclear paraspeckle assembly transcript 1 (non-protein coding)  Increased (FIN 9 11)
OGT       O-linked N-acetylglucosamine (GlcNAc) transferase               Increased (FIN 11 12)
PTGS2     Prostaglandin-endoperoxide synthase 2 (prostaglandin G/H        Increased 11–13
          synthase and cyclo-oxygenase)
RBBP6     Retinoblastoma-binding protein 6                                Increased (FIN 9 11)
TNFAIP3   Tumour necrosis factor, alpha-induced protein 3                 Increased (FIN 11 15 16)
ZMYND8    Zinc finger, MYND-type containing 8                             Increased (FIN 11 12)

FIN, previously unpublished Finnish microarray screen described in detail here.



Kemppinen AK, Kaprio J, Palotie A, et al. BMJ Open 2011;1:e000053. doi:10.1136/bmjopen-2011-000053






Systematic review of genome-wide expression studies in multiple sclerosis 



Table 3 Top 20 pathways associated with the 229 in silico replicated differentially expressed genes

Pathway                                                                     No of differentially expressed genes in the   Fisher exact
                                                                            pathway/total no of genes in the pathway      test p value
Glucocorticoid receptor signalling                                          30/250                                        2.1E-16
IL-6 signalling                                                             15/88                                         6.0E-11
Hepatic fibrosis/hepatic stellate cell activation                           16/116                                        3.6E-10
Molecular mechanisms of cancer                                              24/303                                        1.8E-09
IL-17 signalling                                                            12/70                                         4.9E-09
Pancreatic adenocarcinoma signalling                                        14/102                                        5.1E-09
Nuclear factor (NF)-κB signalling                                           16/154                                        2.4E-08
IL-10 signalling                                                            11/65                                         2.5E-08
Peroxisome Proliferator-Activated Receptor (PPAR) signalling                12/81                                         2.7E-08
Role of osteoblasts, osteoclasts and chondrocytes in rheumatoid arthritis   17/189                                        7.5E-08
Germ cell-Sertoli cell junction signalling                                  14/138                                        2.5E-07
Dendritic cell maturation                                                   14/139                                        2.7E-07
Apoptosis signalling                                                        11/86                                         4.9E-07
p38 Mitogen-Activated Protein Kinase (MAPK) signalling                      11/89                                         6.9E-07
Colorectal cancer metastasis signalling                                     16/206                                        1.3E-06
Atherosclerosis signalling                                                  10/77                                         1.4E-06
Tumor Necrosis Factor Receptor 1 (TNFR1) signalling                         8/45                                          1.4E-06
Regulation of IL-2 Expression in activated and anergic T lymphocytes        10/79                                         1.8E-06
Peroxisome Proliferator-Activated Receptor Alpha (PPARα)/Retinoid X         13/142                                        2.2E-06
  Receptor Alpha (RXRα) activation
IL-4 signalling                                                             9/64                                          2.4E-06

IL, interleukin.



silico replicated DEGs (including regions 100 kb up and
downstream) to contain suggestively associated SNPs
(p≤0.0001) (table 5). The risk variants near three of the
genes, CDK4, IL7R and TNFRSF1A, have been confirmed
with genome-wide significance in GWASs or their follow-up
studies (p≤5×10⁻⁸). 1 3 5 17
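The quantile-quantile comparison mentioned above plots observed p values against those expected under a uniform null distribution. The sketch below shows one common way to compute the plotted points; it is an assumption of this illustration that expected quantiles are taken as i/(n+1), as the text does not specify the convention used, and the function name is hypothetical.

```python
from math import log10


def qq_points(p_values):
    """Return (expected, observed) -log10(p) pairs, ordered from least to most significant."""
    n = len(p_values)
    # p values must be strictly positive for the logarithm to be defined.
    observed = sorted(-log10(p) for p in p_values)
    # Under the null, the i-th smallest of n p values is expected near i / (n + 1).
    expected = sorted(-log10(i / (n + 1)) for i in range(1, n + 1))
    return list(zip(expected, observed))
```

Points lying above the diagonal (observed greater than expected) in a given range indicate an excess of small p values there, which is the kind of enrichment described for SNPs near the replicated genes.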

Table 4 Top 20 pathways associated with the 135 in silico replicated differentially expressed genes after excluding the Finnish
experiment

Pathway                                                               No of differentially expressed genes in the   Fisher exact
                                                                      pathway/total no of genes in the pathway      test p value
Glucocorticoid receptor signalling                                    22/250                                        3.7E-14
IL-17 signalling                                                      9/70                                          7.0E-08
Pancreatic adenocarcinoma signalling                                  10/102                                        1.8E-07
IL-6 signalling                                                       9/88                                          5.2E-07
IL-2 signalling                                                       7/52                                          1.5E-06
Phosphatase and Tensin Homolog (PTEN) signalling                      9/101                                         1.7E-06
IL-15 signalling                                                      7/60                                          4.1E-06
Agrin interactions at neuromuscular junction                          7/60                                          4.1E-06
Tumor Necrosis Factor Receptor 1 (TNFR1) signalling                   6/45                                          9.4E-06
p21-Activated Protein Kinase (PAK) signalling                         7/75                                          1.9E-05
Phosphoinositide 3-Kinase (PI3K)/AKT signalling                       8/108                                         2.5E-05
Regulation of IL-2 expression in activated and anergic T lymphocytes  7/79                                          2.6E-05
Molecular mechanisms of cancer                                        13/303                                        3.0E-05
PPAR signalling                                                       7/81                                          3.1E-05
Aryl hydrocarbon receptor signalling                                  8/115                                         4.0E-05
Role of Janus Kinase 1 (JAK1) and Janus Kinase 3 (JAK3) in γc         6/59                                          4.6E-05
  cytokine signalling
Nuclear factor (NF)-κB signalling                                     9/154                                         5.2E-05
Stress-Activated Protein Kinase (SAPK)/c-Jun N-terminal Kinase        7/90                                          6.1E-05
  (JNK) signalling
Renal cell carcinoma signalling                                       6/64                                          7.3E-05
IL-4 signalling                                                       6/64                                          7.3E-05

IL, interleukin.
Figure 2 Quantile-quantile plot for
Cochran-Mantel-Haenszel p values in the International
Multiple Sclerosis Genetics Consortium genome-wide
association study highlighting single nucleotide
polymorphisms (SNP) in genes showing differential expression
in multiple sclerosis. SNPs in 6p21–22 were excluded. Red
dots represent SNPs mapping within 100 kb of in silico
replicated differentially expressed genes. The remaining SNPs
are represented by black dots. Plot A shows the entire p value
distribution, and plot B is a close-up showing the enrichment of
observed p values in the range 0.1–0.001.

Suggestively associated variant correlates with CXCR4 
expression 

To address the question of whether the suggestively
associated SNPs identified above (p<0.0001) mapping to
15 in silico replicated DEGs might contribute to
the differential expression of these genes in MS, we
first examined the genes using the mRNA by SNP Browser
(http://www.sph.umich.edu/csg/liang/asthma/), 18 19
which is a database of expression QTL (eQTL) SNP
variants. However, none of the suggestively associated
SNPs or their tagging SNPs (r²>0.8) correlated significantly
with the expression of the corresponding DEG
(p<0.001). We also investigated the expression of these
genes in risk allele carriers versus non-carriers in 
lymphoblastoid cell lines of 60 Centre d'Etude du Poly- 
morphisme Humain (CEPH)-derived HapMap samples 
(CEU) (GEO dataset GSE5859) 20 and obtained geno- 
types for the most strongly associated SNP, or SNPs if 
identified in several GWASs, from HapMap Release 24 
(http://www.hapmap.org). 21 In CXCR4, we tested 
rs882300, which was the most strongly associated CXCR4 
SNP in a meta-analysis, instead of the two SNPs mapping 
within 100 kb of the gene. We did not identify any 
significant differences in expression between risk allele 
carriers and non-carriers after correcting for multiple 
testing. However, we found suggestive evidence for 
a higher expression of CXCR4 in carriers of the associated 
G allele at rs882300 (uncorrected one-sided p value=0.04, 
fold change=1.36) (figure 3), which is in concordance 
with the higher expression of CXCR4 observed in MS in 
our Finnish microarray study and three other studies, 
including two which were excluded from the systematic 
review owing to small sample sizes. 11 22 23 This effect was 
also seen with the same probe set (217028_at) in our
microarray data in a combined sample of 11 MS patients
and 14 controls for which a genotype was available



(fold change 1.19, one-sided Mann-Whitney test
p value=0.06). However, no significant difference in
expression levels was observed in two other probe sets
measuring the expression of CXCR4 in our microarray
data. Both of these probe sets recognise exonic
sequences, while 217028_at recognises the 3'UTR of CXCR4.

DISCUSSION 

Despite extensive research and recent successes in 
identifying genetic risk variants predisposing to MS, the 
underlying molecular mechanisms still remain poorly 
understood. Given the suggested role of autoimmunity 
and predominance of immunological genes in loci 
associated with MS, genome-wide expression profiling in 
immune cells is a valid approach for further elucidating 
genes and pathways involved in the disease pathogenesis. 
Although several microarray experiments in MS have 
been conducted, and a large number of differentially 
expressed genes have been reported, in most cases the 
samples have been small, and replication has been 
lacking, making the findings difficult to interpret. While 
no obvious consistencies have emerged from these 
studies, there have not been any systematic attempts to 
evaluate the overlap between them. We therefore 
conducted a systematic review of seven microarray 
studies including our previously unpublished study and 
confirmed that the general overlap between the studies 
was indeed poor: only 229 of all 3574 (6%) genes 
reported to be differentially expressed in MS had been 
identified in at least two studies, and 94% were therefore 
unique to a single study. Only 12 of the 229 DEGs were 
identified in at least three studies. These include NEAT1, 
which encodes a non-coding RNA and suppresses the
expression of CIITA, an activator of genes within the 
major histocompatibility complex (MHC) class II 
locus. 24 Interestingly, the most frequently reported gene, 
HSPA1A, which showed decreased expression in MS in 
four studies, is also functionally connected to the MHC: 
it encodes a heat shock protein, which is likely to be
involved in MHC class I and II mediated antigen 
presentation. 25 The gene itself is located in the MHC 
class III region next to a highly homologous HSPA1B 
gene, and the measured expression levels may reflect the 
expression of both genes. 

However, as is the case for expression studies in 
general, the studies are not directly comparable owing to 
differences in samples, sample sizes and platforms, as 
well as in criteria used for data quality control and for 
declaring differential expression. Furthermore, the 
majority of all reported genes came from the only two 
studies conducted in whole blood samples, and it is 
therefore not necessarily surprising that most of these 
genes were not identified in studies conducted in PBMCs 
or lymphocyte populations. Two studies also reported 
only a subset of the identified genes, and some studies 
were conducted using microarrays covering only a frac- 
tion of currently known human genes. However, perhaps 
the most likely explanation for poor overlap across 






Table 5 Non-human leukocyte antigen single nucleotide polymorphisms (SNP) in in silico replicated differentially expressed
genes with p<0.0001 in a genome-wide association study

Gene
ANXA1, CD40, CDK4, CXCR4, C7orf54, GNG2, IL7R, ITPR1, NPEPPS, PAK2, TGFBR2, TNFAIP3, TNFRSF1A, TRIB2, ZMIZ1

SNP
rs13292677, rs1961830, rs7863238, rs1342022, rs2310333, rs6131010, rs6074022, rs1569723, rs3746821, rs2425764,
rs10876994, rs12368653, rs703842, rs4954555, rs7574456, rs1519529, rs1193335, rs4468527, rs931555, rs6897932,
rs711663, rs9901869, rs4239162, rs11079784, rs6583176, rs12490899, rs892999, rs1800693, rs4149576, rs7607490,
rs1250540

Chr:bp position (hg18)
9:74870791, 9:74872500, 9:74886984, 9:74895327, 9:74897200, 20:44157712, 20:44173603, 20:44175471, 20:44188518,
20:44233852, 12:56351004, 12:56419523, 12:56449006, 2:136509584, 2:136606529, 2:136690727, 7:127404040,
14:51394881, 5:35839334, 5:35910332, 3:4437774, 17:42930205, 17:43110809, 17:43057279, 3:198009975, 3:30649258,
6:138180398, 12:6310270, 12:6319376, 2:12768571, 10:80706013

p Value
0.0001, 0.0001, 0.0001, 0.0001, 0.0001, 8.5E-07, 1.3E-07, 2.9E-07, 9.7E-05, 2.5E-05, 5.4E-11, 2.7E-10, 1.0E-07,
0.0001, 2.5E-05, 4.1E-05, 3.0E-05, 4.1E-05, 2.7E-06, 1.7E-06, 1.6E-05, 2.9E-06, 6.9E-06, 3.6E-05, 3.9E-05, 5.7E-05,
3.1E-06, 1.6E-11, 1.0E-08, 1.0E-05, 1.6E-06

Genome-wide association study
Baranzini et al; ANZgene 5; IMSGC 1; ANZgene 5; Baranzini et al; De Jager et al; De Jager et al; De Jager et al;
De Jager et al; De Jager et al; De Jager et al; Baranzini et al; De Jager et al; De Jager et al; De Jager et al;
De Jager et al; Baranzini et al; De Jager et al

Direction of change in expression in multiple sclerosis
Increased; Increased (FIN 11); 11 15 16; Decreased; Increased (FIN 11); Increased (FIN 9); Decreased 9 16;
Increased 12 14; Increased (FIN 11); Increased 11 12; Decreased 9 15 16; Decreased 9 15 16;
Increased (FIN 11 15 16); Decreased (FIN 15 16); Increased 11 12; Decreased 9 11

The best p value is shown if several p values were provided.
FIN, previously unpublished Finnish microarray screen described in detail here.







Figure 3 Box plot of CXCR4 expression and rs882300 
genotype in 60 Centre d'Etude du Polymorphisme Humain 
(CEPH) lymphoblastoid cell samples. 



studies is a high rate of false positives and low power to 
detect true differences in small samples. Recent GWASs
conducted in large samples have shown that most of the
early genetic associations reported in candidate gene
studies of at most a few hundred individuals were
probably false positives. Small samples may be even
more problematic in expression studies, which are
susceptible to noise introduced by technical and biological
factors. Large studies are required, especially if
the aim is to identify expression changes which are due 
to genetic disease risk variants because the effects of 
common genetic variants on gene expression are in most 
cases relatively modest, even in rather homogeneous cell 
populations, 26 and in small samples the difference in 
risk allele frequency between cases and controls is not 
expected to be significant in the first place. 

However, after excluding 13 genes in the MHC region, 
the 229 in silico replicated DEGs were enriched for 
variants showing a modest association in the IMSGC 
GWAS, 1 suggesting that at least some of these genes are 
likely to play a causative role in MS rather than showing 
differential expression as a result of activation of
immunological pathways secondary to MS. In addition, 
15 of these genes have shown suggestive evidence for 
association with MS (p<0.0001) including CDK4, IL7R 
and TNFRSF1A, which are located in regions of genome-wide
significance. 1 3 5 17 These 15 also include CD40,
which is associated with rheumatoid arthritis, 27 and 
TNFAIP3, which is associated with coeliac disease, SLE, 
psoriasis and rheumatoid arthritis. 28-31 TNFAIP3 was 
also one of the genes identified as differentially 
expressed in three studies, including our own experiment.
11 15 16 It encodes a zinc finger protein which
inhibits nuclear factor (NF)-κB activity and tumour
necrosis factor (TNF)-mediated programmed cell death,
and may therefore play an important role in regulating 
various immunological pathways. 32 However, apart from 
CXCR4, the expression of the 15 DEGs did not correlate 
with the proposed risk variants in lymphoblastoid cell 
lines, although potential eQTL effects should be further 
investigated in other immune cell populations. In
CXCR4, the associated SNP correlated modestly with 
expression when measured by a probe set representing 
the 3' UTR, which may indicate differential usage of
alternative polyadenylation signals. Interestingly, CXCR4 
promotes transendothelial migration of T cells in vitro 
together with its ligand, CXCL12, 33 while CXCR4 and 
CXCR3 antagonists reduce the accumulation of CD4+ T 
cells in the CNS and inhibit EAE pathogenesis. 34 
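
The comparison of expression with genotype discussed above can be illustrated with a minimal sketch. It assumes an additive coding of genotype (0, 1 or 2 copies of one allele) and a plain Pearson correlation; the function names and the toy values are hypothetical, are not data from this study, and the test used in the original analysis may have been different.

```python
from collections import defaultdict
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def mean_expression_by_genotype(dosages, expression):
    """Group expression values by allele dosage and return the group means."""
    groups = defaultdict(list)
    for dosage, value in zip(dosages, expression):
        groups[dosage].append(value)
    return {d: sum(v) / len(v) for d, v in sorted(groups.items())}


# Toy, made-up values (NOT study data): allele dosage of a hypothetical SNP
# and the expression level of a hypothetical probe set in six cell lines.
dosages = [0, 0, 1, 1, 2, 2]
expression = [1.0, 2.0, 2.0, 3.0, 3.0, 4.0]

r = pearson_r(dosages, expression)
means = mean_expression_by_genotype(dosages, expression)
```

The group means correspond to what a box plot by genotype summarises visually, while the correlation coefficient gives a single measure of the additive trend. With samples as small as those discussed here, such a coefficient should be read with caution.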

We acknowledge that results from our previously 
unpublished experiment may have been affected by 
technical factors as well as by the age difference between 
cases and controls. The study also included four patients 
who had received treatment. We therefore reviewed the 
list of DEGs after excluding our study and found that 
135 of the 229 DEGs were identified in at least two 
independent studies. Pathway analysis on both the 229 
and 135 in silico replicated DEGs showed that they were 
highly associated with several immunological pathways. 
Interestingly, the identified interleukin signalling pathways
(IL-4, IL-6 and IL-17) are primarily related to Th2
and Th17 cells rather than Th1 cells, which are thought
to mediate MS. 35-37 Further, IL-6 regulates the balance
between regulatory T cell and Th17 cell differentiation
together with TGF-β. 38 Th17 cells have been linked with
autoimmunity, and several studies have provided
evidence for their role in MS and EAE, 39 while regulatory
T cells have been demonstrated to display loss of
suppressive function in MS. Further studies are needed
to investigate whether changes in expression of genes in 
these interleukin signalling pathways are causative or 
secondary to MS. We also saw a strong association with 
cancer-related pathways, which may suggest some 
common molecular mechanisms behind cancer and 
autoimmunity, such as dysregulation of apoptosis 
signalling. Finally, the pathway showing most significant 
association with the in silico replicated DEGs was the 
glucocorticoid receptor signalling pathway, which is 
a central regulator of inflammation. Although one could
speculate that this may reflect the usage of corticosteroids
as a treatment for MS, patients in the included
studies had reportedly been untreated shortly prior to
sample collection, apart from our study, in which two
patients had been treated with cortisone. This pathway
was also the most significantly associated when our study, 
which included treated patients, was excluded. This 
would suggest that regulation of the endogenous
glucocorticoid receptor signalling pathway may be
disturbed in MS, which is in concordance with previous
evidence of reduced glucocorticoid receptor binding 
affinity and sensitivity in lymphocytes in MS patients. 41 42 
Furthermore, mice producing an antisense RNA for the 
glucocorticoid receptor do not develop EAE. 43 Interestingly,
several of the genes in confirmed MS risk loci are
connected to the glucocorticoid receptor signalling 
pathway, including STAT3, which was recently identified 
through a GWAS by our group 4 and acts as a co-activator 
of glucocorticoid receptor signalling. 44 
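
The re-analysis described above, in which the list of in silico replicated DEGs was recomputed after excluding one study, can be sketched as follows. This is only an illustration of the counting step under simple assumptions: the study labels and gene names are hypothetical placeholders, not the reviewed data, and each gene is counted at most once per study.

```python
from collections import Counter


def replicated_genes(study_to_genes, min_studies=2, exclude=()):
    """Return the genes reported in at least `min_studies` studies.

    `study_to_genes` maps a study label to the genes it reported as
    differentially expressed; studies listed in `exclude` are ignored.
    """
    counts = Counter()
    for study, genes in study_to_genes.items():
        if study in exclude:
            continue
        counts.update(set(genes))  # count each gene once per study
    return {gene for gene, n in counts.items() if n >= min_studies}


# Hypothetical study labels and placeholder gene names (NOT the reviewed data).
study_to_genes = {
    "study_A": ["GENE1", "GENE2", "GENE3"],
    "study_B": ["GENE2", "GENE4"],
    "own_study": ["GENE1", "GENE4", "GENE5"],
}

all_replicated = replicated_genes(study_to_genes)
without_own = replicated_genes(study_to_genes, exclude={"own_study"})
```

Comparing the two sets shows which replicated genes depend on the excluded study for their second report, which is the kind of sensitivity check applied when moving from the 229 to the 135 DEGs.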

In conclusion, we have performed the first systematic 
review of microarray studies in MS. In general, there was 
little overlap between the seven studies investigated,
most likely owing to the small sizes of these studies. 
However, 229 genes were reported to be differentially 
expressed in MS in at least two studies. After excluding 
our unpublished experiment, which may have been 
affected by confounding factors and inclusion of treated 
subjects, 135 genes were identified in at least two studies. 
Of the 229 genes, 12 were reported in at least three 
studies, including TNFAIP3, which is associated with 
several other autoimmune diseases, and NFAT1 and 
HSPA1A, which are both functionally connected to 
MHC, the major MS susceptibility locus. Pathway analyses
on the 229 and 135 DEGs provided support for the
involvement of glucocorticoid receptor signalling and
Th2, Th17 and regulatory T-cell-related interleukin
signalling pathways in MS. Together with accumulating
data from genetic association studies, our findings can 
be helpful in selecting genes and pathways for further 
functional studies in MS. 